Molecular determinants of TRAF6 binding specificity suggest that native interaction partners are not optimized for affinity

Abstract TRAF6 is an adaptor protein involved in signaling pathways that are essential for development and the immune system. It participates in many protein–protein interactions, some of which are mediated by the C‐terminal MATH domain, which binds to short peptide segments containing the motif PxExx[FYWHDE], where x is any amino acid. Blocking MATH domain interactions is associated with favorable effects in various disease models. To better define TRAF6 MATH domain binding preferences, we screened a combinatorial library using bacterial cell‐surface peptide display. We identified 236 of the best TRAF6‐interacting peptides and a set of 1,200 peptides that match the sequence PxE but do not bind TRAF6 MATH. The peptides that were most enriched in the screen bound TRAF6 tighter than previously measured native peptides. To better understand the structural basis for TRAF6 interaction preferences, we built all‐atom structural models of the MATH domain in complex with high‐affinity binders and nonbinders identified in the screen. We identified favorable interactions for motif features in binders as well as negative design elements distributed across the motif that can disfavor or preclude binding. Searching the human proteome revealed that the most biologically relevant TRAF6 motif matches occupy a different sequence space from the best hits discovered in combinatorial library screening, suggesting that native interactions are not optimized for affinity. Our experimentally determined binding preferences and structural models support the design of peptide‐based interaction inhibitors with higher affinities than endogenous TRAF6 ligands.


Supplementary Information
Halpin et al., Molecular determinants of TRAF6 binding specificity suggest that native interaction partners are not optimized for affinity

Stepwise protocol for amplicon generation of Illumina substrates
Sorted pools from the enrichment and nonbinder experiments were grown overnight in 10 mL LB + 25 µg/mL chloramphenicol (OD600 > 1.0), and plasmid DNA was then isolated from each pool (QIAprep miniprep kit, following the manufacturer's instructions). The resulting DNA pools were subjected to the following procedure (see Figure S3):

1. PCR1 amplified the variable region of the bulk plasmid DNA and attached a 5' top-strand overhang containing 12 nt of flanking DNA and a TCCACC MmeI recognition sequence. MmeI cuts 20 nt 3' of its recognition site on the top strand and 18 nt 5' on the bottom strand. The MmeI site appended in PCR1 was positioned in our construct such that MmeI digestion leaves a bottom-strand 3' overhang of 'AG' to match the overhang of our DNA adapters. If a different adapter is chosen, make sure to generate the appropriate overhang. The top-strand 3' primer appends both a 9 nt unique identifier (UID) and a 6 nt index sequence. The appended UIDs were not used in the analysis for this work. After 5 cycles at Ta = 60 °C, the next 20 cycles were run at Ta = 66 °C so that only full-length fragments were amplified. Phusion High-Fidelity Polymerase in HF Buffer was used (0.5 µL Phusion per 50 µL reaction) for all PCR steps. The 6 nt index sequences that we used are listed in Table S1. PCR products were purified with the Zymo DNA Clean and Concentrate Kit (Genesee).

... (Table S1) assigned by pool identity. The adapters contain both the 5' Illumina forward sequencing primer sequence and a 5' 5-nt barcode, each having a 'TC' top-strand 3' overhang for annealing to the designed 'AG' bottom-strand 3' overhang left by MmeI cleavage of purified PCR1 products. Ligated products were then run on a 1% agarose gel containing 10 µL GelGreen per 100 mL total gel volume, and bands of ~117 nt were excised and purified with the Zymoclean Gel DNA Recovery Kit. Purified products were eluted in 20 µL water.

TRAF6-binding SLiMs must be accessible for binding. The hits in our table can be filtered by IUPred score [1] to include only hits that are predicted to be disordered (e.g., IUPred score > 0.4). However, an IUPred score is not a guarantee of accessibility. The AlphaFold pLDDT score is reported to be a good predictor of disorder [2], so we included the average and maximum AlphaFold pLDDT scores of the motif (±3 flanking residues) within the predicted structure of the protein [2][3][4]. For the average score, we recommend a cutoff of < 65 but caution that this will likely remove some instances in which a motif is still accessible despite a high pLDDT score. The maximum pLDDT score of any residue within the motif ±3 residues on each side is also reported, as this may detect cases in which most of the motif is disordered but proximity to a folded domain limits accessibility. For this filter, we recommend retaining only hits with a maximum pLDDT score below 70.
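As an illustration of the accessibility filters described above, the following minimal Python sketch computes the average and maximum pLDDT over a motif plus 3 flanking residues on each side and applies the suggested cutoffs (IUPred > 0.4, average pLDDT < 65, maximum pLDDT < 70). The function names, the 0-based half-open motif coordinates, the toy pLDDT profile, and the choice to require all three criteria at once are assumptions made for this example; they are not taken from our analysis scripts.

```python
from statistics import mean


def motif_plddt(plddt, start, end, flank=3):
    """Return (average, maximum) pLDDT over the motif residues
    plddt[start:end] extended by `flank` residues on each side.
    Coordinates are 0-based and half-open (an assumption of this sketch).
    The window is truncated at the protein termini."""
    if not 0 <= start < end <= len(plddt):
        raise ValueError("motif coordinates fall outside the protein")
    lo = max(0, start - flank)
    hi = min(len(plddt), end + flank)
    window = plddt[lo:hi]
    return mean(window), max(window)


def is_accessible(iupred, avg_plddt, max_plddt,
                  iupred_min=0.4, avg_cutoff=65.0, max_cutoff=70.0):
    """Apply the three suggested cutoffs together.
    Combining them with a logical AND is a choice made for this sketch;
    each filter can also be applied on its own."""
    return (iupred > iupred_min
            and avg_plddt < avg_cutoff
            and max_plddt < max_cutoff)


# Hypothetical per-residue pLDDT profile: folded, disordered, folded.
toy_plddt = [90.0] * 5 + [40.0] * 12 + [90.0] * 5

# A 6-residue motif in the middle of the disordered stretch.
avg_a, max_a = motif_plddt(toy_plddt, 8, 14)
# A 6-residue motif whose flank overlaps the folded region.
avg_b, max_b = motif_plddt(toy_plddt, 3, 9)

print(avg_a, max_a, is_accessible(0.6, avg_a, max_a))
print(round(avg_b, 2), max_b, is_accessible(0.6, avg_b, max_b))
```

In the second toy case the average pLDDT falls below 65 but the maximum does not fall below 70, which is the situation the maximum-score filter is intended to catch.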
Proteins involved in biological processes similar to those of TRAF6 are more promising candidate interaction partners [5]. To identify MATH domain binding motifs in proteins that share functions with TRAF6, we used Gene Ontology (GO) annotations [6,7]. Specifically, we used SLiMSearch to retrieve GO terms for TRAF6, where each term has an associated p-value representing the likelihood that any 2 proteins in the proteome share that term by chance (p-value from SLiMSearch [5]). The table of proteome hits can then be filtered to include only proteins that share 1 or more TRAF6 GO terms with a p-value below a given threshold. Smaller p-value cutoffs remove more general GO terms from the list of terms used in the filter and therefore provide greater stringency. We provide several different p-value cutoffs as options and suggest p = 0.01 as a starting point.
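The logic of the GO-term filter can be sketched in Python as follows. This is only an illustration: the data structures, the function name, and the placeholder term and protein identifiers are hypothetical, and the p-values are invented for the example rather than retrieved from SLiMSearch.

```python
def filter_by_shared_go(hit_terms, traf6_term_pvalues, p_cutoff=0.01):
    """Keep proteins that share at least one TRAF6 GO term whose
    p-value is below `p_cutoff`.

    hit_terms: dict mapping protein name -> set of GO terms
    traf6_term_pvalues: dict mapping TRAF6 GO term -> p-value
    Returns a dict mapping each retained protein to the shared terms.
    """
    # Terms specific enough to be used at this stringency.
    usable = {term for term, p in traf6_term_pvalues.items() if p < p_cutoff}
    retained = {}
    for protein, terms in hit_terms.items():
        shared = terms & usable
        if shared:
            retained[protein] = shared
    return retained


# Placeholder identifiers and invented p-values, for illustration only.
traf6_terms = {"TERM_SPECIFIC": 0.001, "TERM_GENERAL": 0.2}
hits = {
    "protein_1": {"TERM_SPECIFIC", "TERM_OTHER"},
    "protein_2": {"TERM_GENERAL"},
    "protein_3": set(),
}

strict = filter_by_shared_go(hits, traf6_terms, p_cutoff=0.01)
lenient = filter_by_shared_go(hits, traf6_terms, p_cutoff=0.5)
print(sorted(strict), sorted(lenient))
```

With the stricter cutoff, the general term no longer counts toward a match, so a protein that shares only that term is dropped.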
Many proteins have been reported to interact with TRAF6 without identification of the mode of interaction. We used the HIPPIE database [8] to identify which of the proteome windows that we evaluated are in proteins that are already annotated to be TRAF6 interaction partners. A high-scoring motif match in a protein annotated as a TRAF6 interaction partner could indicate that its interaction with TRAF6 is likely to occur through the MATH domain.
Our structural analysis identified several sequence features that disfavor or prevent PxE peptides from binding to the MATH domain (Table 2). We added filters to the table to identify candidate motifs with unfavorable residues, such as a proline at positions (+1) to (+5), a large/medium residue (QHILFYW) at position (+3), or a positively charged residue (RK) at position (+3), (+4), or (+5).

Figure S1. Binding curves from single-clone FACS titrations. Mean PE fluorescence is plotted against TRAF6 concentration and fit to a standard binding equation (equation 1). The mean K_D from fitting each replicate independently and the associated standard error of the mean are shown above each graph.

... and RNF103 (C), homologous proteins were retrieved using HomoloGene [9]. Orthologs from a representative set of vertebrates were selected and their sequences were aligned using Clustal Omega [10]. Multiple sequence alignment images were generated using Jalview [11]. Motifs with sequence features that our model predicts are favorable for high-affinity binding to TRAF6 are indicated in purple boxes. For RIPK1, putative TIM6 motifs that contain Glu rather than Trp at (+5) are indicated in the blue box.
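The unfavorable-residue filters described above for the proteome table (proline at (+1) to (+5), QHILFYW at (+3), and R or K at (+3) to (+5)) can be sketched in Python as follows. The sketch assumes that the proline of the PxE motif is position 0, so that the final motif position is (+5); this numbering is inferred from the text rather than stated in it. The function name and the toy windows are hypothetical, and the function only flags the listed features without predicting binding.

```python
LARGE_MEDIUM = set("QHILFYW")  # disfavored at position (+3)
POSITIVE = set("RK")           # disfavored at positions (+3) to (+5)


def unfavorable_features(window):
    """Return a list of unfavorable features found in a 6-residue
    window that starts at the motif proline (assumed position 0)."""
    if len(window) != 6:
        raise ValueError("expected a 6-residue window, positions 0 to +5")
    flags = []
    # Proline anywhere from (+1) to (+5).
    if any(window[i] == "P" for i in range(1, 6)):
        flags.append("proline at +1..+5")
    # Large or medium residue at (+3).
    if window[3] in LARGE_MEDIUM:
        flags.append("large/medium residue at +3")
    # Positively charged residue at (+3), (+4), or (+5).
    if any(window[i] in POSITIVE for i in (3, 4, 5)):
        flags.append("positive residue at +3..+5")
    return flags


# Toy windows chosen only to exercise each filter.
for toy in ("PAEGSW", "PPEGSW", "PAELSW", "PAEGKW"):
    print(toy, unfavorable_features(toy))
```

Returning the individual flags, rather than a single pass/fail value, keeps each filter separately selectable, as in the table.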